RNA-like polymer model: exact calculation on the Bethe lattice 
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We consider a lattice polymer model (random walk), in which the walk is allowed to visit lattice 
bonds at most twice. Such a model might have some relevance to describe statistical properties of 
RNA molecules. In order to mimic base pairing, we assign an attractive energy term to each doubly- 
visited bond, and a further contribution to each pair of consecutive doubly-visited bonds. The latter 
term is expected to mimic the stacking effect, whereas no effect of sequence, that is, of chemical 
specificity, is taken into account. The phase diagram is worked out exactly on a Bethe lattice, in a 
grand-canonical formulation. In the single molecule limit, the system undergoes two different phase 
transitions, upon decreasing temperature: a O-like collapse from a swollen "coil" state to a "molten" 
state, with a low fraction of doubly-visited bonds, and subsequently to a "paired" state, with empty 
or doubly-visited bonds only. The stacking effect drives the latter transition from second to first 
order. 

PACS numbers: 05.70.Fh, 64.60.Cn, 61.25.Hq, 87.15.Aa, 87.14.Gg 



I. INTRODUCTION 



Lattice self-avoiding walks, i.e., random walks that are 
forbidden to visit lattice sites more than once, have long 
been employed for modeling linear polymers in a good 
solvent A short range interaction between noncon- 
secutive monomers has been also considered, in order to 
represent either Van der Waals attractive forces between 
monomers or the effective result of solvophobic interac- 
tions. Such interactions cause the well-known transi- 
tion from a swollen coil at high temperature to a compact 
globule at low temperature (2j. The Bethe approxima- 
tion [E i.e., the exact solution on the Bethe lattice, 
has been shown to reproduce with reasonable accuracy, 
and with negligible computational effort, the phase be- 
havior of such basic model (O model) and also of slightly 
more complicated polymer models [E 0, H, [E EE] ■ 

Several variations of the model have been proposed, 
in order to describe different physical phenomena. In 
particular, it is possible to relax the self-avoidance con- 
straint, allowing the walk to visit lattice bonds at most 
twice. If the polymer chain is assigned an orientation, 
and lattice bonds are allowed to be doubly-visited only 
by opposite chain segments, the model is denoted as 2- 
tolerant trail [ll|, EH- This model may be useful to 
investigate configurational statistics of RNA molecules, 
whose importance in molecular biology is being more 
and more recognized 0, EI EE El- Similar to DNA, 
a RNA molecule is a long polymer chain composed of 
four different monomers (bases), adenine, cytosine, gua- 
nine, and uracil, which are pairwise complementary (i.e., 
adenine- uracil and cytosine-guanine pairings are energet- 
ically favored by formation of hydrogen bonds [lj} ) ■ At 
a coarse-grained level, one can neglect the differences 
among bases, and assign an attractive (contact) energy 
for each base pairing, that is, for each doubly-visited 
bond. 



investigated the previously described model, performing 
accurate Monte Carlo simulations on the face-centered 
cubic (fee) lattice, fully taking into account the excluded 
volume effect, and showing the existence of a continuous 
phase transition (similar to the collapse) from a high 
temperature state in which the RNA is almost completely 
unpaired to a low temperature state with a significant 
fraction of paired bases (the so-called molten phase). 

In this work, we first verify that the Bethe approxi- 
mation, which is able to take into account excluded vol- 
ume at a local level @, predicts a 0-like transition as 
well. Moreover, we consider an extended model with a 
more general energy function: We assign a specific en- 
ergy contribution to consecutive paired bases, without 
intermediate branching, in order to mimic the so-called 
stacking effect [lE EE EE]- Indeed, the stacking effect, 
mostly related to hydrophobicity [T^], is claimed to be 
energetically more relevant than base pairing [l~7j . and 
however has great importance for algorithms attempt- 
ing to predict the secondary structure of given RNA se- 
quences [2(j. In the statistical physics literature, models 
with ordinary pairing energy only [ll|, EE I2ll-f22l . HE HI] , 
or with stacking energy only (25[, or both [26l [have been 
considered, but the relative importance of the stacking 
effect with respect to base pairing has been scarcely in- 
vestigated 27]. On the contrary, we specifically address 
the issue of stacking, assigning different relevance to one 
of the two interactions, by means of an adjustable param- 
eter. We observe that our model predicts, in the low tem- 
perature region, a fully base-paired phase. Such phase, 
which does not at all correspond to a unique secondary 
structure, might describe -with some cautions- an aver- 
age "native" state. The phase transition to the molten 
phase ( "denaturation" ) turns out to be continuous, for 
the ordinary pairing energy model, but, upon adding 
even a small stacking energy, it turns out to change into 
first order. 



Quite recently, Baiesi, Orlandini, and Stella [Tl| have The paper is organized as follows. In Sec. [TTJ we intro- 
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duce the model in some more detail and give an overview 
of the Bethe lattice calculation. In Sec. IIIII we work out 
the phase behavior of the model, with particular atten- 
tion to the single-molecule limit, and in Sec. IIVI we dis- 
cuss the results, adding some concluding remarks. Ap- 
pendices |A] and [B] are devoted respectively to a deriva- 
tion of the equilibrium free energy and of the recursion 
equations for the Bethe lattice, while in Appendix [Cl we 
report the analytical calculation of the O-like transition 
temperature. 



II. THE MODEL AND THE BETHE LATTICE 
CALCULATION 

As previously mentioned, our polymer model is a self- 
avoiding walk, which is exceptionally allowed to visit 
each lattice bond (at most) twice, but is not allowed to 
self-intersect. We can imagine roughly the following pic- 
ture. Each segment of the walk represents a monomer 
(base), whereas empty lattice bonds represent the sol- 
vent. Doubly-visited bonds (i.e., bonds occupied by two 
segments) represent paired bases, yielding an attractive 
energy — (/3 — 7), with (3 > 7 > 0. Moreover, every pair 
of consecutive doubly-visited bonds yields an additional 
attractive contribution —7, providing a rough descrip- 
tion of the stacking effect. It turns out that j3 is the 
maximum pairing energy, obtained by consecutive base- 
pairing, and will be taken as the energy unit. According 
to the grand-canonical formulation, a chemical poten- 
tial fj, is associated to each monomer, while the solvent 
chemical potential is conventionally assumed to be zero. 

Let us spend a few words to notify that the coarse- 
grained model introduced above includes some degree of 
inconsistency. In particular, to be more precise, segments 
of the walk ought to represent stretches of the order of the 
persistence length of the polymer chain. Unfortunately, 
the persistence length turns out to be much different for 
single- or double-stranded RNA stretches (being much 
larger in the latter case), but we nonetheless describe 
both cases within a single lattice bond. We expect that 
such inconsistency should not alter the qualitative phase 
behavior of the model, since this is what happens for sim- 
ilar models of DNA, as it has been also noted in Ref. [111. 

A configuration of the system can be defined by spec- 
ifying the number of segments on each lattice bond. 
Therefore, we define a configuration variable rij = 0,1,2 
(occupation number) for the i-th lattice bond. Of course, 
such configuration variables are not independent, but 
have to satisfy some constraints. In particular, on each 
set of bonds of a given site, we impose the following con- 
ditions: (i) the total number of segments must be even; 
(ii) there cannot be more than 2 unpaired segments; (iii) 
if only 2 segments are present, they must be unpaired. 
Tab. U exemplifies the constraints in more detail, for the 
simple case with coordination number equal to 4, but 
generalizing to any coordination number is straightfor- 
ward. 



Constraint (i) is a simple connectivity constraint, stat- 
ing that the chain does not terminate after a finite num- 
ber of segments. Constraints (ii) and (iii) state that, if 
2 unpaired segments come to a given site from differ- 
ent lattice bonds, they either pair each other (so that 
at least another bond is doubly occupied) or they are 
consecutive along the chain (all other bonds are empty). 
More precisely, constraint (ii) implies that unpaired chain 
stretches behave like self-avoiding walks, which cannot 
visit a lattice site more than once, unless they get paired. 
Constraint (iii) deserves some more discussion. A config- 
uration with only two paired segments could represent 
a "terminal loop" , in which the chain bends onto it- 
self, to form a hairpin. In principle, such "zero-length 
loops" should be allowed by a basic 2-tolerant polymer 
model. Therefore, constraints (iii), which forbids them, 
can be considered either as a further detail, which defines 
a slightly different model, or as an approximation to the 
original one. Such approximation, which is conceptually 
independent of the subsequent approximate (Bethe) sta- 
tistical treatment, has been firstly taken for technical rea- 
sons, in order to simplify the analytical calculations. We 
shall shortly discuss this technical issue in the following. 
By now, we only observe that there is actually a phys- 
ical argument, which suggests that the modified model 
might be even a bit closer to the real system. In fact, 
for energetic reasons, terminal loops must have a mini- 
mum length of four bases, and experiments show that, 
in real RNA, typical hairpin loops are just of this kind 
(tetraloops) [13]. Therefore, a hairpin loop should have 
a finite, though small, entropy, which cannot be taken 
into account by a zero-length loop in the coarse-grained 
model. 

Assuming a coordination number k + 1, the Hamilto- 
nian can be formally written as 

{io,.-,ik} i 

where the former sum runs over all sets of bonds 
{io, of all lattice sites, and the latter over all 

bonds i. Single-bond energy terms h n take into ac- 
count pairing energies and chemical potential contribu- 
tions, and can be defined as follows 

ho = 0, (2) 
hi = -fi, (3) 
h 2 = _(/3_ 7 )_ 2 /i. (4) 

Many-bond terms i?n ,...,n fe take into account the con- 
straints, assigning infinite energy penalties to forbidden 
configurations, and the stacking energy contributions. A 
definition of these terms would be quite cumbersome, 
from an analytical point of view, so that we can assume 
they are defined by a table like Tab. fl] 

Let us briefly return to discuss constraint (iii), which, 
as previously mentioned, disallows zero-length hairpin 
loops, i.e., configurations with only two paired segments 
on the set of bonds of a given site. It turns out that, 
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TABLE I: Configurations of a set of bonds of a given 
lattice site (left column); occupation numbers m for each 
bond i — 0,...,k (mid columns); total number of seg- 
ments Nn ,...,n k = 52i=o Ui an< ^ corresponding energy term 
H no ,...,n k (right columns). Notice that: graphical representa- 
tions are limited to four bonds; only configurations with even 
number of segments are reported, because odd numbers are 
forbidden (the corresponding energy terms are oo); energy 
terms are invariant under bond permutations. 
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if we allowed such a configuration, local constraints of 
the form H no ^ ^ nk would also allow, for instance, con- 
figurations with two paired segments disconnected from 
everything else, or even paired stretches of any length. 
In this way, we would not study an infinitely long poly- 
mer, but a mixture of polymers of different lengths, and 
this would be a completely different system. In order to 
avoid constraint (iii) , we would have to exclude the afore- 
mentioned undesired configurations, and we would need 
a more complicated treatment. This will be the subject 
of a future work. 

Let us now introduce the Bethe approximation. Ba- 
sically, it consists in replacing the single walk on the 
regular lattice by a gas of walks, with the same self- 
avoidance constraints and interactions, on a Bethe lat- 
tice, having the same coordination number as the regu- 
lar lattice one. In older literature, a Bethe lattice was 
simply understood to be the inner region of an infinite 




FIG. 1: Sketch of a Bethe lattice with k = 2. 

Cayley tree psj . Nevertheless, it has been subsequently 
shown that the thermodynamic behavior of a Cayley tree 
is strongly affected by the presence of a boundary, and 
the exact solution for this system does not agree with 
the Bethe approximation [2{|. More recently, it has 
been recognized that the Bethe lattice has to be a ho- 
mogeneous, boundary-less structure (so that the ther- 
modynamic properties of the system can be worked out 
by suitable self-consistence equations @), which also al- 
lows for the presence of macroscopically large loops [3(| • 
The Bethe lattice is thus better defined as an ensemble 
of random graphs with fixed coordination number [3ll ]. 
which are locally treelike (in the sense that loop length 
is C(ln JV), where Af is the number of nodes), and whose 
thermodynamic behavior is governed by the variational 
Bethe free energy The free energy minima can still 
be determined by solving a suitable recursion relation for 
the so-called partial partition functions [1, [H[ . We shall 
address this issue in Appendix O Hereafter, we just give 
an intuitive derivation, based on the treelike nature of the 
system. Let us consider for instance the Bethe lattice de- 
picted in Fig. [2 and the right part of the system, starting 
with the bond denoted by 0. Since loops connecting the 
two parts are (with high probability) infinitely long in 
the thermodynamic limit, we can imagine that the two 
parts are actually disconnected branches and that we can 
thus define a partial Hamiltonian, obtained by Eq. ([I]) re- 
stricting the sum to bond variables in one branch. The 
corresponding partial partition function W„ can be com- 
puted by summing the Boltzmann weights of the partial 
Hamiltonian over the configurations of the branch except 
the bond. Actually, it is convenient to work with a nor- 
malized partial partition function w na oc W no , such that 

2 

n =0 

The normalized partial partition function ui„ repre- 
sents, as a function of no, the probability distribution 
of the configuration variable "in the absence of the other 
branch" . Let us now observe that, in the thermodynamic 
limit, and in the hypothesis of a homogeneous system, the 
subbranches attached to the bond should be equivalent 
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to main one, so that one can write the recursion equation 

2 2 k 

w no = q~ 1 e- hn ° ■■■ e-^o^i.-.n* Y[w ni . (6) 

m— n k — i— 1 

The sum runs over configuration variables of bonds at- 
tached to the bond (ni and ri2, in our example), the 
energy terms h no and H 7lQ ni ^ ^ nk are assumed to be nor- 
malized to temperature, and q is a normalization con- 
stant, imposed by Eq. ([5]). A more explicit form of the 
recursion equation is given in Appendix [Bl where the 
specific energy terms H na ^ ni ^.^ nk of our model are taken 
into account. The recursion equation can be solved nu- 
merically by a simple iterative algorithm. All equilibrium 
properties of the system can be derived from the knowl- 
edge of the partial partition function. 

First of all, we can compute the probability distribu- 
tion p n of a bond configuration variable n, by considering 
the operation of attaching 2 branches to the given bond. 
We obtain 

Pn = z^e^wl, (7) 

where 

2 

z = J2e h "wl (8) 

provides normalization. The average number of segments 
per bond, which we shall briefly refer to as density in the 
following, can be evaluated as 

2 

p = ^ np n =pi+ 2p 2 . (9) 

?!=0 

The density p is the main order parameter for our system. 
As a secondary order parameter, we evaluate the fraction 
of paired segments 

4> = 2 P 2/p- (10) 

The grand-canonical free energy (grand-potential) per 
bond lo can be determined as 

2\a.q-(h-V)\rxz 

"= fe + l ' 

where q is the normalization constant of the recursion 
equation © and z is given by Eq. The derivation 
of this expression requires some manipulations and is re- 
ported in Appendix [XJ From the knowledge of the grand- 
potential one can derive all other thermodynamic prop- 
erties, and determine thermodynamic stability for each 
phase. 

III. THE PHASE DIAGRAM AND THE 
SINGLE-MOLECULE LIMIT 

In the framework of a grand-canonical formulation, the 
phase diagram can be described as a function of temper- 
ature and chemical potential. For a polymer, the lat- 
ter controls the average chain length. For example, in 
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FIG. 2: Chemical potential-temperature (p//3 vs 1//3) phase 
diagram for the zero stacking case (7 = 0, upper graph) and 
a nonzero stacking case (j//3 = 0.2, lower graph). Solid lines 
denote first order transitions; dashed lines denote second or- 
der transitions. The ordinary dense phase is denoted by I, 
whereas the fully paired dense phase is denoted by II. The 
zero density phase is left blank. The insets display the re- 
gions enclosed in the small rectangles. 



the simple model, there exists a phase transition line 
p = (f(T) at which (for increasing p values) the aver- 
age length either diverges continuously (for temperatures 
higher than some temperature O) or jumps discontinu- 
ously to infinity (for temperatures lower than O) [l[ . The 
transition line is identified as the thermodynamic limit 
of a single chain, so that we denote it as the "single- 
molecule" limit. Alternatively, the system can be de- 
scribed in terms of a segment density p, and one obtains 
p = for p < ip(T) and p > for p > ip{T). The 
transition is second order for T > 9 and first order for 
T < 6. In the limit p — > ip(T) + , the properties of the 
dense phase approach those of a single chain, and, in par- 
ticular, the segment density p is a measure of the chain 
compactness. Therefore, the tricritical point (<d,tp(®)), 
known as O point, represents a coil-globule collapse. 

We present grand-canonical phase diagrams of our 
Bethe lattice model for the case of zero stacking ef- 
fect (7 = 0) and for a case of nonzero stacking effect 
il/P — 0.2), which show qualitatively different behav- 
iors. Let us recall that 7//? quantifies the ratio between 
the neat stacking energy 7 and the total effect of pairing 
and stacking (the simple pairing energy is (3 — 7). We 
shall shortly denote 7//? as stacking ratio in the follow- 
ing. We set k = 11 (coordination number = 12), expect- 
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ing to approximate the fee lattice. Let us consider the 
zero stacking case first. The phase diagram is displayed 
in Fig. [2] (upper graph), where the temperature variable 
is 1//3 and the chemical potential variable is /U.//3. We 
find three different phases: a zero density phase (O), an 
ordinary dense phase (I), and a fully paired dense phase 
(11). The zero density phase is characterized by p = 0. 
Since only a vanishing fraction of bonds is occupied in 
this phase, also the grand-potential per bond u> vanishes, 
and, for the same reason, the fraction of paired segments 
4> is undefined. The I phase is characterized by < p < 2 
and < <j> < 1) he., it is a dense phase which possesses a 
finite fraction of paired segments. We can roughly com- 
pare it to the dense phase of an ordinary 6 model. Fi- 
nally, the II phase is characterized by < p < 2 and 
4> = 1, i.e., it is a dense phase in which every segment is 
paired. 

The transition line between the O and I phases turns 
out to be partially first and partially second order. The 
two regimes are separated by a tricritical point. As ex- 
plained in Appendix [U] it is possible to determine ana- 
lytically the equation of the second order line, 



p = — In fc, 
and the location of the tricritical point 



(3 = ki 



k + (k - 1) e~i 



(12) 



(13) 



It has been pointed out that, for this kind of (2-tolerant) 
polymer models, the tricritical point exhibits peculiar 
values of the critical exponents, that are different from 
those of the ordinary self-avoiding walk with attractive 
interaction, and suggest a linear-to-branched polymer 
transition [33j . Evidences of such a behavior will be ob- 
served also in our model. Nevertheless, since this point 
still corresponds to a continuous collapse and, since ex- 
ponent differences cannot be detected at a Bethe approx- 
imation level, we shall all the same speak of a O point in 
the following. 

In the dense region (p > 0) a second order transition 
line separates the I and II phases. This line joins to the 
transition line with the O phase at a critical end-point. 
The latter corresponds to another continuous conforma- 
tional transition for the single molecule. The behavior is 
different, in the presence of the stacking effect, as shown 
in Fig. [5] (lower graph). The same three phases O, I, and 
II discussed above are present, and also the high temper- 
ature region of the phase diagram is qualitatively similar, 
although we can observe a lower temperature, in agree- 
ment with Eq. (|13p . On the contrary, the I-II transition 
line turns out to be partially second and partially first 
order, giving rise to a tricritical point in the dense re- 
gion. In this way, the critical end-point disappears, and 
is replaced by a triple point, which corresponds to a dis- 
continuous transition in the single-molecule limit. 

Let us now investigate this limit in more detail. First of 
all, we consider the fraction of paired segments <f>, com- 
puted for p tending to the transition line from above, 




FIG. 3: Fraction of paired segments as a function of tem- 
perature ((/> vs 1//3) in the single-molecule limit, for different 
values of the stacking ratio 7//3. Dashed lines denote disconti- 
nuities; a thin solid line connects transition values. The inset 
displays the region enclosed in the small rectangle. 



as a function of temperature. The results are reported 
in Fig. [31 for three different values of the stacking ra- 
tio j/(3 = 0,0.2,1. For all cases, we can see that 4> 
is rigorously zero above the O temperature. In this 
regime, which we can denote as coil state, the poly- 
mer behaves like an ordinary self-avoiding walk without 
self-interaction. Upon decreasing temperature below the 
point, the fraction of paired segments begins to in- 
crease, revealing formation of contacts. We can iden- 
tify this regime as the molten state. As previously men- 
tioned, the O temperature decreases, upon increasing the 
stacking energy. Upon further decreasing temperature, <j> 
reaches the saturation value = 1. In this regime, which 
we simply denote as paired state since all segments are 
paired, we can imagine our system as a branched dou- 
ble chain. In this sense, we can identify this phase as a 
"native" RNA-like state, although it does not at all corre- 
spond to a single configuration, as it will become clearer 
later. As previously mentioned, the molten-paired transi- 
tion is continuous in the zero stacking case, but becomes 
first order in the nonzero stacking cases. More precisely, 
we observe that the stacking energy needed to change 
the order of the transition is very small but finite, as 
suggested in Fig. [3] by the thin line connecting the tran- 
sition values of <f>. 

We also investigate the temperature dependence of the 
entropy per segment, which can be computed as follows. 
The grand-potential per bond can be written as 



w = / - pp, 



(14) 



where / is the Helmholtz free energy per bond. As pre- 
viously mentioned, to vanishes at the O phase boundary, 
so that in this case p coincides with the Helmholtz free 
energy per segment 



A* = f/P- 



(15) 



Remembering that our energies are normalized to tem- 
perature, the equation of the O phase boundary can be 
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FIG. 4: Entropy per segment as a function of temperature 
(s/p vs 1/(3) in the single-molecule limit, for different values 
of the stacking ratio 7//3. Dashed lines denote discontinuities; 
thin solid lines connect transition values. The inset displays 
the region enclosed in the small rectangle. A dash-dotted line 
indicates the entropy value lnfc/2. 



written as 

H/P = <p(lf0), (16) 

where the function <p is known numerically with high pre- 
cision. The entropy per segment (in natural units) can 
thus be easily determined as 

a/p = -<p'(l/0), (17) 

where cp 1 denotes the first derivative of ip. We report the 
results in Fig. fU for the usual values of the stacking ra- 
tio j/{3 = 0,0.2,1. For all cases, in the coil state, the 
entropy per segment is rigorously independent of tem- 
perature and equal to In k, as it can be easily derived by 
Eqs. ([in]), H2) and (fTT)) . This value characterizes an or- 
dinary self-avoiding walk on the Bethe lattice @ . In the 
molten state, the entropy starts decreasing (as tempera- 
ture decreases), more and more rapidly, upon increasing 
the stacking effect. Finally, in the paired state, the en- 
tropy is almost constant and its value turns out to be 
slightly larger than (lnfc)/2, which would characterize a 
self-avoiding double chain. The excess entropy with re- 
spect to this value, which is due to branching, tends to 
zero as temperature goes to zero, and decreases upon 
increasing the stacking effect. 

According to the results reported so far, the low 
temperature phase might appear as almost completely 
quenched. The following analysis of the average length 
of double chain stretches (which we shall shortly refer 
to as stacking length in the following) demonstrates that 
this is not the case. Let it denote the probability that, 
given a lattice bond occupied by two paired segments, 
just one neighbor bond (in a given direction) is occu- 
pied by two paired segments as well. We can call it the 
stacking probability. Due to the Markovian nature of the 
Bethe lattice, the probability of having a double chain 
stretch of length I in the given direction is 7r (1 — tt). 



FIG. 5: Average stacking length as a function of temperature 
(A vs 1//3) in the single-molecule limit, for different values of 
the stacking ratio 7//?. Dashed lines denote discontinuities; 
thin solid lines connect transition values; a dash-dotted line, 
determined by Eqs. (fl8|) and (f2Tj) , connects O-point values. 

The average stacking length is therefore 

00 1 
A = ^^- 1 (1-tt) = - . (18) 

1 — ' t — TT 

1=1 

Considering the recursion equation (|B3|) . we can derive 
the stacking probability as the ratio between the weight 
of the stacked configuration (j)e 7 J02W^ 1 and the to- 
tal weight of the configurations compatible with the two 
paired segments (coinciding with the left-hand side, at 
a fixed point of the recursion equations). Remembering 
also the expression (j4} for /12, we easily obtain 

n = q- 1 ke 2 » +0 w*-\ (19) 

where of course q and wq are available from the numerical 
solution. It is also useful to derive an explicit expression 
for 7T at the second order O-I phase boundary, in order to 
avoid taking limits numerically. Performing basically the 
same calculation with Eq. (|C2[) . taking the limit x, y — ► 0, 
and making use of Eq. (|C6[) . we obtain 



Moreover, comparing this equation with Eq. (|13[) . we ob- 
tain, at the point, the following simple relation 

ic = e /k. (21) 

The results are reported in Fig. El again for 7//? — 
0, 0.2, 1. The most interesting features appear in the coil 
and paired states. In particular, we can observe that the 
stacking length is constant with respect to temperature 
if 7 = 0, i.e., in the absence of the stacking effect. On 
the contrary, even a very small stacking energy makes 
the stacking length increase upon decreasing tempera- 
ture. Since the length scale is logarithmic and the tem- 
perature scale is "inverse" , straight lines indicate that A 
is exponential in (3 in these phases. As a consequence, 



7 



0.00 0.05 




0.0 0.2 0.4 0.6 0.8 1.0 



y/p 

FIG. 6: Transition temperatures as a function of the stack- 
ing ratio vs 7//3) in the single- molecule limit. Solid lines 
denote first order transitions; dashed lines denote second or- 
der transitions. The inset displays the region enclosed in the 
small rectangle. 

in the presence of the stacking effect, the stacking length 
diverges as temperature goes to zero, so that the ground 
state of the model can be thought of as a unique double 
chain (hairpin). Let us also notice that, in the coil state, 
the stacking length does not vanish at any finite tempera- 
ture value, unlike the fraction of paired segments. These 
results do not disagree, meaning that, if a (rare) contact 
is formed, it has nevertheless a probability of not being 
isolated. 

We have already described the overall behavior of 
the phase transitions in the single-molecule limit, as 
a function of the stacking ratio. As this parameter 
increases, the temperature decreases, whereas the 
molten-paired transition temperature increases, and the 
transition changes from second to first order. For the sake 
of completeness, we report in Fig. [6] the transition tem- 
peratures as a function of the stacking ratio. The tran- 
sition line is given analytically by Eq. (flU)) , whereas the 
molten-paired transition has been determined numeri- 
cally. In the latter transition, we recover the continuous 
regime at very low stacking values and the discontinu- 
ous regime at higher stacking values. Quantitatively, the 
boundary between the two regimes is found to occur at 
7//3 « 0.0210. 



IV. DISCUSSION AND CONCLUSIONS 

In this paper, we have investigated a 2-tolerant poly- 
mer model on the Bethe lattice, with both contact and 
stacking interactions. The model is expected to mimic 
some qualitative features of conformational transitions of 
RNA molecules. The most striking results are the pres- 
ence of a low temperature transition to a fully-paired 
phase and the effect of stacking, which turns out to 
drive such transition from second to first order. We 
have already questioned, throughout the paper, whether 
these results have some relevance to the denaturation 



transition of real RNA. The most important warning 
concerns the fact that we completely neglect chemical 
heterogeneities, which are indeed present in RNA. As 
a consequence, we observe that the fully-paired state 
does not correspond to a well defined secondary struc- 
ture, but indeed to a variety of structures. On the con- 
trary, several investigations proposed in the literature 
take into account randomly distributed heterogeneous 
sequences [H [H [H H Even in this case, the 

low temperature glasslike phase does not correspond to 
a fixed structure, but the general claim is that it could 
describe average RNA properties. On this kind of mod- 
els, the only work we are aware of, which performs a 
systematic investigation as a function of the strength of 
the stacking interaction, is one by Burghardt and Hart- 
mann [27J , In the cited paper, the authors do not find 
any evidence of a change in the order of the (temperature- 
induced) denaturation transition. Nevertheless, in a pre- 
vious work, Zhou and Zhang [34| had observed that an 
increasing stacking energy could change the order of a 
force-induced denaturation. This result has been actu- 
ally criticized by Miiller [26] , who argued that the appar- 
ent first order transition was rather to be interpreted as a 
sharp cross-over. It is important to notice, however, that 
all the cited investigations neglect the effect of excluded 
volume. On the contrary, the Bethe lattice approxima- 
tion is partially able to account for excluded volume, by 
imposing local self-avoidance constraints. This may be 
indeed a reason for our qualitatively different results. In 
order to investigate this issue in more detail, it would 
be interesting to extend the Bethe lattice analysis to the 
case of a random heterogeneous sequence, making use 
the recently proposed cavity method [3l| , along the lines 
traced by Montanari, Miiller, and Mezard, for the self- 
avoiding heteropolymer [H, [36| . 

Let us also compare our results with the Monte Carlo 
simulations by Baiesi, Orlandini, and Stella [TTJ] . In the 
cited work, the authors investigate a 2-tolerant trail with 
contact energy only, on the fee lattice, fully taking into 
account excluded volume. As the polymer is assigned an 
orientation (whence the term "trail"), only antiparallel 
contacts are allowed. Let us notice, by the way, that in 
our treatment we have not introduced orientation explic- 
itly. Nevertheless, it is possible to show that the latter, 
together with the constraint on antiparallel contacts, are 
equivalent to a simple renormalization of the partial par- 
tition functions, with no effect on observable quantities. 
Of course, we cannot expect that any result concerning 
critical exponents could be reproduced in the framework 
of our mean-field-like approach. Nevertheless, it is no- 
ticeable that, for k = 11 (fee lattice) and 7 = (contact 
energy only), we predict a temperature (/3 « 1.7513) 
not so far from the Monte Carlo result (/3 « 1.9). Apart 
from this result, the fact itself that we observe a 0-likc 
transition may rise some interest. Indeed, an important 
result of Ref.[ll|is that, if pseudoknots (i.e., tertiary RNA 
structure) are forbidden, then the ©-like transition is 
replaced by a smooth cross-over. We wonder how the 
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Bethe lattice approach can reproduce (at least qualita- 
tively) the situation with pseudoknots, since the model is 
embedded on a treelike structure. Our tentative expla- 
nation is based on the previously mentioned definition of 
the Bethe lattice in terms of random graphs [31( , rather 
than the more usual and quick definition ("the infinite 
interior of the infinite Cayley tree" @). In the former 
picture, pseudoknots could be realized via large loops 
present in the graphs, although in the current treatment 
we do not have any control on them. Let us finally notice 
that, conversely, just the presence of only large loops in 
the Bethe lattice might create some artifact in the de- 
scription of RNA statistics, especially with a constraints 
which disallows zero-length hairpin turns, as discussed in 
Sec. HIl We find these issues worth a deeper investigation, 
which we would like to devote a future work to. 



APPENDIX A: BETHE FREE ENERGY 

In this appendix we first give a derivation of the re- 
cursion equation ^ as a stationarity condition for the 
variational Bethe free energy, and then prove the valid- 
ity of the expression (jlip for the equilibrium free energy. 

Let us consider a Bethe lattice with c coordination 
number, and assume that a configuration variable rij is 
associated to each lattice bond i. Let us also assume that 
the Hamiltonian of the system is the one given in Eq. |T]), 
which includes couplings among the c bonds of each site. 
Let the coupling terms be invariant under permutation 
of the configuration variables. Expecting a homogeneous 
thermodynamic state (i.e., that all local probability dis- 
tributions are equal), we write the Bethe free energy per 
site as 




(Al) 

where {rii} stands for rii, . . . , n c , while p n and P{ ni } de- 
note respectively the single-variable and the c-variable 
probability distributions. Accordingly, and J2{m} 
denote the sums over possible values of the configuration 
variables. Let us notice that, as far as the entropic part 
is concerned, the latter term of the Bethe free energy 
can be thought of as a correction over the former term, 
such that the mean field free energy is recovered, when 
the joint probability distribution factorizes. Equilibrium 
probability distributions can be determined as minima of 
the Bethe free energy, satisfying suitable normalization 
and compatibility constraints. By "compatibility", we 
mean that marginalizations of the joint probability dis- 
tribution must give the single-variable distribution, ac- 
cording to the relations 

Pn, = ^2 i = 1, . . . , c, (A2) 

{"jli^i 

where the sum runs over possible values of the configura- 
tion variables n±, . . . ,n c , except n,-. We thus have a con- 



strained optimization problem, for which, in the frame- 
work of the Lagrange multiplier method, we can solve 
analytically stationarization with respect to probability 
distributions. Doing so, the latter can be written as a 
function of suitable variables z, Z, and w n , which corre- 
spond to Lagrange multipliers, and are to be determined 
in order to satisfy the constraints. We obtain 

Pn = z^e^wl, (A3) 

C 

P{n,} = Z- 1 e- H ^l[w ni . (A4) 

i=l 

Let us notice that w n plays the role of the (normal- 
ized) partial partition function introduced in the text, 
whereas the two constants z and associated to the nor- 
malization constraints, are easily determined as a func- 
tion of uu n . Moreover, imposing the compatibility con- 
straints (|A2jl . one obtains the following recursion equa- 
tion 

c 

w nt =q- 1 e- h ^ ]T e-^-^JJ^, (A5) 

{ n j}j^i j=l 

where 

q = Z/z. (A6) 

It is possible to show that, because of a slight redundance 
of the constraints, one can choose the constant q at each 
iteration in an arbitrary way, for instance by imposing the 
normalization condition ^ n w n — 1 , without affecting 
"observable" quantities. Let us also notice that the c 
compatibility conditions (|A2jl would require in principle 

c sets of "Lagrange multipliers" Wn\ for i = l,...,c. 
Nevertheless, one can show that, due to invariance of 
Hi ni \ under permutation, all the sets must be equal to a 
single one, which we have just denoted as w n . 

Let us now derive the simple free energy formula pip 
presented in the text. Let us plug the expressions (|A3[) 
and (IA4j) for the equilibrium probability distributions 
into the logarithmic terms of the variational free en- 
ergy (|A1[) . By simple algebra, we obtain 




Since at equilibrium the normalization and compatibility 
constraints are satisfied, the previous expression imme- 
diately simplifies to 

F=-\nZ+^\nz. (A8) 

Taking into account that there are c/2 bonds per site, 
that c = k + 1, and making use of Eq. (|A6[) , we finally 
obtain Eq. pip for the equilibrium free energy per bond. 
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APPENDIX B: RECURSION EQUATIONS 

In this appendix we write an explicit form for the 
recursion equations ([6]), introducing the values of the 
couplings iy« Dl ... j determined according to criteria ex- 
plained in Tab. Q] Moving the constant q and the single- 
variable energies h n to the left-hand sides, and remem- 
bering that ho — 0, we obtain 



qw 



m=0 
m^l,2 



El»\ m k-m , / ft \ -y 2 fc-2 



qe hl w 1 



qe h2 W2 



2/ ' ^ \ m 

711 — 



fc\ ^ A - 1 



m=0 



^- 2 ~ m , (Bl) 



E 



m„,,fc— 2— m 




(B2) 



fc-1 



\ o ^ / A^ 2 



m=0 



,fc— 2— m 



J 2 ^0 



(B3) 



Let us give a physical explanation of the various terms 
appearing on the right-hand sides. They have to take 
into account all the allowed configurations of k bonds 
sharing one site with a given bond, whose configuration 
is fixed (n = 0, 1, 2 for the three equations, respectively). 
In the first equation, the fixed configuration is n — 
(empty bond). In the right-hand side, the first two terms 
refer to configurations with m = 0, . . . , k bonds occupied 
by paired segments, and k — m empty bonds. As ex- 
plained in the text, the case m = 1 is forbidden, and 
the case m — 2 is treated separately, since it has to 
take into account a stacking energy contribution. The 
third term deals with the case of 2 bonds occupied by 
unpaired segments, m = 0, . . . , k — 2 by paired segments, 
and k — 2 — m empty bonds. In the second equation, 
the fixed bond configuration is n = 1 (bond occupied by 
an unpaired segment). Therefore, in the right-hand side, 
there is always one bond occupied by an unpaired seg- 
ment, together with m = 0, . . . , A; — 1 occupied by paired 
segments, and k—l — m empty bonds. In the third equa- 
tion, the fixed configuration is n = 2 (bond occupied by 
paired segments). In the right-hand side, the first two 
terms refer to configurations with m = 0, . . . , k bonds 
occupied by paired segments, and k — m empty bonds. 
As in the first equation, the case m = is forbidden, 
and the case m = 1 is treated separately, because of the 
stacking energy contribution. The third term deals with 
2 bonds occupied by unpaired segments, m = 0, . . . , k — 2 
by paired segments, and k — 2 — m empty bonds. 

The above form of the recursion equations can be fur- 
ther simplified, making use of the binomial expansion. 



By simple algebra, we finally obtain 



qw = {w + w 2 ) k + Uf (w + w 2 ) k (B4) 



qe hl wi 



W\ (w + W 2 



,fc-l 



qe 2 w 2 = (w + w 2 ) ' + ( 2 )w 1 (w + w 2 ) 



fc-2 



(B5) 



(B6) 



APPENDIX C: THETA POINT 

Hereafter, we report the derivation of Eq. (|13p. i.e., 
the analytical expression for the transition tempera- 
ture. Eq. (fT2"j) . i.e., the second order O-I phase boundary, 
comes out as a by-product of this derivation. Let us first 
define the ratios x = w\/wq and y = w 2 /wq, for which 
we can easily derive two recursive equations from (|B4[) . 
(IB51). and (El 



(iH 1 + y) k ~\ (C1) 

_ h2 (1 + y) k + Q> 2 (1 + y) k - 2 + (g - l)y 

d 

(C2) 



where 



(1 + y) k + Q)x 2 (1 + yf-* Q y + Q) (e 7 - l)y\ 

(C3) 

From (|Cip and (|C3|) , assuming that x ^ 0, i.e., that we 
are in the I phase, we obtain 

(C4) 

Moreover, in the y — » limit, i.e., very close to the 
boundary with the O phase, we can write 



{ke» -l) + (Jfe e " + k-2)y + 0{y 2 ). (C5) 



Since we have observed from the numerics that such 
boundary is second order, we have that y — > should 
imply also x — > 0. We then argue that the zeroth or- 
der term on the right-hand side of the previous equation 
must vanish. In this way we obtain Eq. (|12p . Plugging 
this equation into the previous one, we obtain 



x 2 = (k-l)y + 0(y 2 ), 



(C6) 
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whereas, remembering also Eq. ((4]), Eq. (|C2[) becomes 
„g fc(e 7 + l) - 1 



y = e 



-y + o(y'). 



(C7) 



Now, in order to allow the possibility that, along the 
phase boundary, there can exist some point in which y is 



vanishingly small but not zero (i.e., a tricritical point), 
we have to equate the first order coefficients on the two 
sides of the previous equation. By simple algebra, we 
obtain the Q point condition (flU]) . 
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